HPC-CLUST: distributed hierarchical clustering for large sets of nucleotide sequences


Similar resources

HPC-CLUST: Distributed hierarchical clustering for very large sets of nucleotide sequences

Motivation: Nucleotide sequence data is being produced at an ever increasing rate. Clustering such sequences by similarity is often an essential first step in their analysis – intended to reduce redundancy, define gene families, or suggest taxonomic units. Exact clustering algorithms, such as hierarchical clustering, scale relatively poorly in terms of run time and memory usage, yet they are de...


HPC-CLUST: distributed hierarchical clustering for large sets of nucleotide sequences

Motivation: Nucleotide sequence data are being produced at an ever increasing rate. Clustering such sequences by similarity is often an essential first step in their analysis, intended to reduce redundancy, define gene families or suggest taxonomic units. Exact clustering algorithms, such as hierarchical clustering, scale relatively poorly in terms of run time and memory usage, yet they are desir...


Approximating Hierarchical MV-sets for Hierarchical Clustering

The goal of hierarchical clustering is to construct a cluster tree, which can be viewed as the modal structure of a density. For this purpose, we use a convex optimization program that can efficiently estimate a family of hierarchical dense sets in high-dimensional distributions. We further extend existing graph-based methods to approximate the cluster tree of a distribution. By avoiding direct...


Efficient Hierarchical Clustering of Large Data Sets Using P-trees

Hierarchical clustering methods have attracted much attention by giving the user a maximum amount of flexibility. Rather than requiring parameter choices to be predetermined, the result represents all possible levels of granularity. In this paper a hierarchical method is introduced that is fundamentally related to partitioning methods, such as k-medoids and k-means as well as to a density based...


Self-Organizing Clustering: A Novel Non-Hierarchical Method for Clustering Large Amount of DNA Sequences

To cluster and characterize DNA sequences focusing on the oligonucleotide frequency, we developed a novel method and program package designated as Self-Organizing Clustering (SOC) [4]. Being based on the Self-Organizing Map (SOM) [1, 2, 3], the SOC algorithm made use of a modified K-means. In the SOC, the oligonucleotide frequency was regarded as a series of oligonucleotide patte...



Journal

Journal title: Bioinformatics

Year: 2013

ISSN: 1460-2059,1367-4803

DOI: 10.1093/bioinformatics/btt657